PTDT-3807: Add temporal audio annotation support #2013

rishisurana-labelbox · 2025-09-08T17:52:00Z

Description

This PR introduces Audio Temporal Annotations - a new feature that enables precise time-based annotations for audio files in the Labelbox SDK. This includes support for temporal classification annotations with millisecond-level timing precision.

Motivation: Audio annotation workflows require precise timing control for applications like:

Podcast transcription with speaker identification
Call center quality analysis with word-level annotations
Music analysis with temporal classifications
Sound event detection with precise timestamps

Context: This feature extends the existing audio annotation infrastructure to support temporal annotations, using a millisecond-based timing system that provides the precision needed for audio applications while maintaining compatibility with the existing NDJSON serialization format.

Type of change

New feature (non-breaking change which adds functionality)
Document change (fix typo or modifying any markdown files, code comments or anything in the examples folder only)

All Submissions

Have you followed the guidelines in our Contributing document?
Have you provided a description?
Are your changes properly formatted?

New Feature Submissions

Does your submission pass tests?
Have you added thorough tests for your new feature?
Have you commented your code, particularly in hard-to-understand areas?
Have you added a Docstring?

Changes to Core Features

Have you written new tests for your core changes, as applicable?
Have you successfully run tests with your changes locally?
Have you updated any code comments, as applicable?

Summary of Changes

New Audio Temporal Annotation Types

AudioClassificationAnnotation: Time-based classifications (radio, checklist, text) for audio segments
Millisecond-based timing: Direct millisecond input for precise timing control
INDEX scope support: Temporal classifications use INDEX scope for frame-based annotations

Core Infrastructure Updates

Generic temporal processing: Refactored audio-specific logic into reusable TemporalFrame, AnnotationGroupManager, ValueGrouper, and HierarchyBuilder components
Modular architecture: Created temporal.py module with generic components that can be reused for video, audio, and other temporal annotation types
Frame-based organization: Temporal annotations organized by millisecond frames for efficient processing
MAL compatibility: Audio temporal annotations work with Model-Assisted Labeling pipeline
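
As a rough illustration of the grouping step, the sketch below merges segments that share a classification name and answer into one entry with several frame ranges. It is not the PR's ValueGrouper: the function name `group_by_name_and_value` and the tuple input format are assumptions made for this example, and only the output shape mirrors the expected NDJSON in the attached test script.

```python
from collections import defaultdict


def group_by_name_and_value(annotations):
    """Group (name, value, start_ms, end_ms) tuples by name, then by value.

    Hypothetical helper for illustration; not part of the SDK.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for name, value, start_ms, end_ms in annotations:
        grouped[name][value].append({"start": start_ms, "end": end_ms})

    # Dicts preserve insertion order, so answers appear in first-seen order.
    return [
        {
            "name": name,
            "answer": [
                {"name": value, "frames": frames}
                for value, frames in by_value.items()
            ],
        }
        for name, by_value in grouped.items()
    ]


radio_segments = [
    ("radio_class", "first_radio_answer", 200, 1500),
    ("radio_class", "first_radio_answer", 2000, 2500),
    ("radio_class", "second_radio_answer", 1550, 1700),
    ("radio_class", "second_radio_answer", 2700, 3000),
]
result = group_by_name_and_value(radio_segments)
```

Each answer ends up with one `frames` list, which is how the radio segments in the test script collapse into two answers with two ranges each.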

Code Architecture Improvements

Separation of concerns: Extracted complex nested logic into focused, single-purpose components
Type safety: Generic components parameterized with Generic[TemporalAnnotation] so that static type checkers can verify usage
Configurable frame extraction: frame_extractor callable allows different annotation types to use the same processing logic
Enhanced frame operations: Added overlaps() method and improved temporal containment logic
Backward compatibility: Audio usage remains unchanged via create_audio_ndjson_annotations() convenience function
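
To make the overlap and containment checks concrete, here is a minimal sketch of a millisecond frame span. The class name `FrameSpan` and its fields are assumptions for illustration and do not reproduce the PR's TemporalFrame; the sketch also assumes closed intervals and treats a missing end as a single-point frame.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FrameSpan:
    """Hypothetical closed interval [start, stop] in milliseconds."""

    start: int
    end: Optional[int] = None  # None means a single-point frame

    @property
    def stop(self) -> int:
        return self.start if self.end is None else self.end

    def contains(self, other: "FrameSpan") -> bool:
        # True when `other` lies entirely inside this span.
        return self.start <= other.start and other.stop <= self.stop

    def overlaps(self, other: "FrameSpan") -> bool:
        # Closed intervals overlap when neither ends before the other starts.
        return self.start <= other.stop and other.start <= self.stop


parent = FrameSpan(1500, 2400)
child = FrameSpan(1600, 2000)
touching = FrameSpan(2400, 2500)
disjoint = FrameSpan(2500, 2700)
```

With closed intervals, `parent.overlaps(touching)` holds because both spans include millisecond 2400. A half-open convention would give the opposite answer, so the choice should match whatever the import pipeline expects.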

Testing

Comprehensive serialization test script: Added test_v3_serialization.py (attached at the bottom), which validates both structure and values
Updated test cases: Enhanced test coverage for audio temporal annotation functionality
Integration tests: Audio temporal annotations work with existing import/export pipelines
Edge case testing: Precision testing for millisecond timing and mixed annotation types
Value validation: Tests verify that all annotation values and frame ranges are preserved correctly

Documentation & Examples

Updated example notebook: Enhanced audio.ipynb with temporal annotation examples
Demo script: Added demo_audio_token_temporal.py showing per-token temporal annotations
Use case examples: Word-level speaker identification and temporal classifications
Best practices: Guidelines for ontology setup with INDEX scope

Serialization & Import Support

NDJSON format: Audio temporal annotations serialize to standard NDJSON format with hierarchical structure
Import pipeline: Full support for audio temporal annotation imports via MAL and Label Import
Frame metadata: Millisecond timing preserved in serialized format
Backward compatibility: Existing audio annotation workflows unchanged
Nested classification support: Complex hierarchical temporal classifications with proper containment logic
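
The "closest containment" rule mentioned in the test script can be sketched as picking the narrowest candidate parent whose range fully covers the child. The helper name `closest_parent` and the plain `(start, end)` tuples are assumptions for this example. The real logic presumably also has to respect the ontology's parent/child relationships, which this sketch ignores.

```python
def closest_parent(child, parents):
    """Return the narrowest (start, end) range in `parents` that contains `child`.

    Hypothetical helper; returns None when no candidate contains the child.
    """
    containing = [
        p for p in parents if p[0] <= child[0] and child[1] <= p[1]
    ]
    # The smallest duration wins, so a nested segment beats its own ancestor.
    return min(containing, key=lambda p: p[1] - p[0], default=None)


candidates = [(1500, 2400), (1600, 2000)]
nearest = closest_parent((1800, 2000), candidates)
orphan = closest_parent((3100, 3200), candidates)
```

Using the values from the test script, the 1800-2000 segment attaches to the 1600-2000 range rather than the wider 1500-2400 one, and a segment outside every candidate gets no parent.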

Key Features

Precise Timing Control

# Millisecond-based timing for precise audio annotation
speaker_annotation = lb_types.AudioClassificationAnnotation(
    frame=2500,  # 2.5 seconds
    end_frame=4100,  # 4.1 seconds
    name="speaker_id",
    value=lb_types.Radio(answer=lb_types.ClassificationAnswer(name="john"))
)
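
Since the interface takes milliseconds directly, callers holding timestamps in seconds need a conversion first. The helper below is a small sketch and not part of the SDK; the name `seconds_to_ms` is an assumption.

```python
def seconds_to_ms(seconds: float) -> int:
    """Convert seconds to whole milliseconds (hypothetical helper)."""
    if seconds < 0:
        raise ValueError("timestamps must be non-negative")
    # round() rather than int(): binary floats can land just below the
    # intended value, and truncation would then lose a millisecond.
    return round(seconds * 1000)


start_ms = seconds_to_ms(2.5)
end_ms = seconds_to_ms(4.1)
```

The resulting integers can be passed as `frame` and `end_frame` in the example above.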

Per-Token Temporal Annotations

# Word-level temporal annotations
tokens_data = [
    ("Hello", 586, 770),    # Hello: frames 586-770
    ("GPT", 771, 955),      # GPT: frames 771-955  
    ("what", 956, 1140),    # what: frames 956-1140
]

temporal_annotations = []
for token, start_frame, end_frame in tokens_data:
    token_annotation = lb_types.AudioClassificationAnnotation(
        frame=start_frame,
        end_frame=end_frame,
        name="User Speaker",
        value=lb_types.Text(answer=token)
    )
    temporal_annotations.append(token_annotation)

Ontology Setup for Temporal Annotations

# INDEX scope required for temporal classifications
ontology_builder = lb.OntologyBuilder(classifications=[
    lb.Classification(
        class_type=lb.Classification.Type.TEXT,
        name="User Speaker",
        scope=lb.Classification.Scope.INDEX,  # INDEX scope for temporal
    ),
])

Label Integration

import uuid

# Temporal annotations work seamlessly with existing Label infrastructure.
# text_annotation, checklist_annotation, and radio_annotation are assumed to be
# non-temporal classification annotations defined earlier.
label = lb_types.Label(
    data={"global_key": "audio_file.mp3"},
    annotations=[text_annotation, checklist_annotation, radio_annotation] + temporal_annotations
)

# Upload via MAL
upload_job = lb.MALPredictionImport.create_from_objects(
    client=client,
    project_id=project.uid,
    name=f"temporal_mal_job-{uuid.uuid4()}",
    predictions=[label],
)

Technical Architecture

Generic Temporal Components

The refactored architecture provides reusable components for any temporal annotation type:

# Generic components that work with audio, video, or any temporal annotation
from labelbox.data.serialization.ndjson.temporal import (
    TemporalFrame,
    AnnotationGroupManager,
    ValueGrouper,
    HierarchyBuilder,
    create_temporal_ndjson_annotations
)

# Audio-specific usage (backward compatible)
ndjson_annotations = create_audio_ndjson_annotations(audio_annotations, global_key)

# Future video usage
def video_frame_extractor(ann):
    return (ann.frame, ann.frame)  # Single frame for video

ndjson_annotations = create_temporal_ndjson_annotations(
    video_annotations, global_key, video_frame_extractor
)

This feature enables the Labelbox SDK to support precise temporal audio annotation workflows while providing a robust, reusable architecture for future temporal annotation types. The modular design ensures maintainability and extensibility while preserving full backward compatibility.
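
The role of the frame extractor can be shown with a self-contained sketch in which one collection routine serves two annotation shapes. Everything here (`AudioAnn`, `VideoAnn`, `collect_ranges`, and both extractors) is a stand-in invented for illustration, not the PR's classes or the signature of create_temporal_ndjson_annotations.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional, Protocol, Tuple, TypeVar


class Named(Protocol):
    name: str


T = TypeVar("T", bound=Named)
FrameRange = Tuple[int, int]


@dataclass
class AudioAnn:  # hypothetical stand-in for an audio annotation
    name: str
    frame: int
    end_frame: Optional[int] = None


@dataclass
class VideoAnn:  # hypothetical stand-in for a video annotation
    name: str
    frame: int


def collect_ranges(
    annotations: Iterable[T],
    frame_extractor: Callable[[T], FrameRange],
) -> Dict[str, List[FrameRange]]:
    """Collect (start, end) ranges per name; the extractor hides type details."""
    ranges: Dict[str, List[FrameRange]] = {}
    for ann in annotations:
        ranges.setdefault(ann.name, []).append(frame_extractor(ann))
    return ranges


def audio_frame_extractor(ann: AudioAnn) -> FrameRange:
    # A missing end_frame is treated as a single-point range.
    end = ann.frame if ann.end_frame is None else ann.end_frame
    return (ann.frame, end)


def video_frame_extractor(ann: VideoAnn) -> FrameRange:
    return (ann.frame, ann.frame)


audio_ranges = collect_ranges([AudioAnn("speaker_id", 2500, 4100)], audio_frame_extractor)
video_ranges = collect_ranges([VideoAnn("scene", 12)], video_frame_extractor)
```

Passing the extractor as a callable keeps the grouping code unaware of whether an annotation carries one frame or a start/end pair.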

Python Script (test_v3_serialization.py)

#!/usr/bin/env python3

"""
Test v3 Class-Based Serialization
Compares the serialized NDJSON output from v3_class_based.py annotations
to ensure they are deeply equal. This is a pure serialization test - no uploads.
"""

import json
import sys
import os

# Add the labelbox source to the path
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..', 'libs', 'labelbox', 'src'))

import labelbox.types as lb_types
from labelbox.data.serialization.ndjson.converter import NDJsonConverter


def create_v3_annotations():
    """Create the same AudioClassificationAnnotation instances as v3_class_based.py"""
    ann: list[lb_types.AudioClassificationAnnotation] = []

    # text_class top-level values
    ann.append(lb_types.AudioClassificationAnnotation(frame=1000, end_frame=1100, name="text_class", value=lb_types.Text(answer="A")))
    ann.append(lb_types.AudioClassificationAnnotation(frame=1500, end_frame=2400, name="text_class", value=lb_types.Text(answer="text_class value")))
    ann.append(lb_types.AudioClassificationAnnotation(frame=2500, end_frame=2700, name="text_class", value=lb_types.Text(answer="C")))
    ann.append(lb_types.AudioClassificationAnnotation(frame=2900, end_frame=2999, name="text_class", value=lb_types.Text(answer="D")))

    # nested under text_class value segment (closest containment)
    ann.append(lb_types.AudioClassificationAnnotation(frame=1600, end_frame=2000, name="nested_text_class", value=lb_types.Text(answer="nested_text_class value")))
    ann.append(lb_types.AudioClassificationAnnotation(frame=1800, end_frame=2000, name="nested_text_class_2", value=lb_types.Text(answer="nested_text_class_2 value")))

    # radio_class top-level values with two segments for first and two for second
    ann.append(lb_types.AudioClassificationAnnotation(frame=200, end_frame=1500, name="radio_class", value=lb_types.Radio(answer=lb_types.ClassificationAnswer(name="first_radio_answer"))))
    ann.append(lb_types.AudioClassificationAnnotation(frame=2000, end_frame=2500, name="radio_class", value=lb_types.Radio(answer=lb_types.ClassificationAnswer(name="first_radio_answer"))))
    ann.append(lb_types.AudioClassificationAnnotation(frame=1550, end_frame=1700, name="radio_class", value=lb_types.Radio(answer=lb_types.ClassificationAnswer(name="second_radio_answer"))))
    ann.append(lb_types.AudioClassificationAnnotation(frame=2700, end_frame=3000, name="radio_class", value=lb_types.Radio(answer=lb_types.ClassificationAnswer(name="second_radio_answer"))))

    # nested radio: sub_radio_question and sub_radio_question_2
    ann.append(lb_types.AudioClassificationAnnotation(frame=1000, end_frame=1500, name="sub_radio_question", value=lb_types.Radio(answer=lb_types.ClassificationAnswer(name="first_sub_radio_answer"))))
    ann.append(lb_types.AudioClassificationAnnotation(frame=2100, end_frame=2500, name="sub_radio_question", value=lb_types.Radio(answer=lb_types.ClassificationAnswer(name="second_sub_radio_answer"))))
    ann.append(lb_types.AudioClassificationAnnotation(frame=1300, end_frame=1500, name="sub_radio_question_2", value=lb_types.Radio(answer=lb_types.ClassificationAnswer(name="first_sub_radio_answer_2"))))

    # checklist_class top-level
    ann.append(lb_types.AudioClassificationAnnotation(frame=300, end_frame=800, name="checklist_class", value=lb_types.Checklist(answer=[lb_types.ClassificationAnswer(name="first_checklist_option")])))
    ann.append(lb_types.AudioClassificationAnnotation(frame=1200, end_frame=1800, name="checklist_class", value=lb_types.Checklist(answer=[lb_types.ClassificationAnswer(name="first_checklist_option")])))
    ann.append(lb_types.AudioClassificationAnnotation(frame=2200, end_frame=2900, name="checklist_class", value=lb_types.Checklist(answer=[lb_types.ClassificationAnswer(name="second_checklist_option")])))
    ann.append(lb_types.AudioClassificationAnnotation(frame=2500, end_frame=3500, name="checklist_class", value=lb_types.Checklist(answer=[lb_types.ClassificationAnswer(name="third_checklist_option")])))

    # nested under checklist_class (distributed by containment over the above frames)
    ann.append(lb_types.AudioClassificationAnnotation(frame=400, end_frame=700, name="nested_checklist", value=lb_types.Checklist(answer=[lb_types.ClassificationAnswer(name="nested_option_1")])))
    ann.append(lb_types.AudioClassificationAnnotation(frame=1200, end_frame=1600, name="nested_checklist", value=lb_types.Checklist(answer=[lb_types.ClassificationAnswer(name="nested_option_2")])))
    ann.append(lb_types.AudioClassificationAnnotation(frame=1400, end_frame=1800, name="nested_checklist", value=lb_types.Checklist(answer=[lb_types.ClassificationAnswer(name="nested_option_3")])))
    ann.append(lb_types.AudioClassificationAnnotation(frame=500, end_frame=700, name="checklist_nested_text", value=lb_types.Text(answer="checklist_nested_text value")))

    return ann


def create_expected_ndjson():
    """Create the expected NDJSON structure that should be generated"""
    global_key = "test-global-key"
    
    # This represents the expected nested structure after serialization
    expected = [
        {
            "name": "text_class",
            "answer": [
                {
                    "value": "A",
                    "frames": [{"start": 1000, "end": 1100}]
                },
                {
                    "value": "text_class value", 
                    "frames": [{"start": 1500, "end": 2400}],
                    "classifications": [
                        {
                            "name": "nested_text_class",
                            "answer": [
                                {
                                    "value": "nested_text_class value",
                                    "frames": [{"start": 1600, "end": 2000}],
                                    "classifications": [
                                        {
                                            "name": "nested_text_class_2",
                                            "answer": [
                                                {
                                                    "value": "nested_text_class_2 value",
                                                    "frames": [{"start": 1800, "end": 2000}]
                                                }
                                            ]
                                        }
                                    ]
                                }
                            ]
                        }
                    ]
                },
                {
                    "value": "C",
                    "frames": [{"start": 2500, "end": 2700}]
                },
                {
                    "value": "D", 
                    "frames": [{"start": 2900, "end": 2999}]
                }
            ],
            "dataRow": {"globalKey": global_key}
        },
        {
            "name": "radio_class",
            "answer": [
                {
                    "name": "first_radio_answer",
                    "frames": [
                        {"start": 200, "end": 1500},
                        {"start": 2000, "end": 2500}
                    ],
                    "classifications": [
                        {
                            "name": "sub_radio_question",
                            "answer": [
                                {
                                    "name": "first_sub_radio_answer",
                                    "frames": [{"start": 1000, "end": 1500}],
                                    "classifications": [
                                        {
                                            "name": "sub_radio_question_2",
                                            "answer": [
                                                {
                                                    "name": "first_sub_radio_answer_2",
                                                    "frames": [{"start": 1300, "end": 1500}]
                                                }
                                            ]
                                        }
                                    ]
                                },
                                {
                                    "name": "second_sub_radio_answer",
                                    "frames": [{"start": 2100, "end": 2500}]
                                }
                            ]
                        }
                    ]
                },
                {
                    "name": "second_radio_answer",
                    "frames": [
                        {"start": 1550, "end": 1700},
                        {"start": 2700, "end": 3000}
                    ]
                }
            ],
            "dataRow": {"globalKey": global_key}
        },
        {
            "name": "checklist_class",
            "answer": [
                {
                    "name": "first_checklist_option",
                    "frames": [
                        {"start": 300, "end": 800},
                        {"start": 1200, "end": 1800}
                    ],
                    "classifications": [
                        {
                            "name": "nested_checklist",
                            "answer": [
                                {
                                    "name": "nested_option_1",
                                    "frames": [{"start": 400, "end": 700}],
                                    "classifications": [
                                        {
                                            "name": "checklist_nested_text",
                                            "answer": [
                                                {
                                                    "value": "checklist_nested_text value",
                                                    "frames": [{"start": 500, "end": 700}]
                                                }
                                            ]
                                        }
                                    ]
                                },
                                {
                                    "name": "nested_option_2",
                                    "frames": [{"start": 1200, "end": 1600}]
                                },
                                {
                                    "name": "nested_option_3",
                                    "frames": [{"start": 1400, "end": 1800}]
                                }
                            ]
                        }
                    ]
                },
                {
                    "name": "second_checklist_option",
                    "frames": [{"start": 2200, "end": 2900}]
                },
                {
                    "name": "third_checklist_option",
                    "frames": [{"start": 2500, "end": 3500}]
                }
            ],
            "dataRow": {"globalKey": global_key}
        }
    ]
    
    return expected


def normalize_for_comparison(obj):
    """Normalize objects for comparison by sorting lists so element order is ignored"""
    if isinstance(obj, dict):
        return {k: normalize_for_comparison(v) for k, v in obj.items()}
    elif isinstance(obj, list):
        # Dicts are not orderable, so sort on a canonical JSON rendering of each item
        return sorted(
            (normalize_for_comparison(item) for item in obj),
            key=lambda x: json.dumps(x, sort_keys=True),
        )
    else:
        return obj


def deep_compare(obj1, obj2, path=""):
    """Deep comparison of two objects with detailed path reporting"""
    # Normalize both objects for comparison
    norm_obj1 = normalize_for_comparison(obj1)
    norm_obj2 = normalize_for_comparison(obj2)
    
    if type(norm_obj1) != type(norm_obj2):
        return False, f"Type mismatch at {path}: {type(norm_obj1)} vs {type(norm_obj2)}"
    
    if isinstance(norm_obj1, dict):
        keys1, keys2 = set(norm_obj1.keys()), set(norm_obj2.keys())
        if keys1 != keys2:
            missing1 = keys2 - keys1
            missing2 = keys1 - keys2
            return False, f"Key mismatch at {path}: missing in obj1: {missing1}, missing in obj2: {missing2}"
        
        for key in keys1:
            equal, error = deep_compare(norm_obj1[key], norm_obj2[key], f"{path}.{key}")
            if not equal:
                return False, error
    
    elif isinstance(norm_obj1, list):
        if len(norm_obj1) != len(norm_obj2):
            return False, f"Length mismatch at {path}: {len(norm_obj1)} vs {len(norm_obj2)}"
        
        for i, (item1, item2) in enumerate(zip(norm_obj1, norm_obj2)):
            equal, error = deep_compare(item1, item2, f"{path}[{i}]")
            if not equal:
                return False, error
    
    else:
        if norm_obj1 != norm_obj2:
            return False, f"Value mismatch at {path}: {norm_obj1} vs {norm_obj2}"
    
    return True, ""


def test_v3_serialization():
    """Test that v3 class-based annotations serialize to expected NDJSON structure"""
    print("🧪 Testing v3 Class-Based Serialization")
    print("=" * 60)
    
    # Create annotations (same as v3_class_based.py)
    annotations = create_v3_annotations()
    print(f"✅ Created {len(annotations)} AudioClassificationAnnotation instances")
    
    # Create label
    global_key = "test-global-key"
    label = lb_types.Label(data={"global_key": global_key}, annotations=annotations)
    print(f"✅ Created Label with {len(annotations)} annotations")
    
    # Serialize to NDJSON
    print("\n🔄 Serializing to NDJSON...")
    ndjson_output = list(NDJsonConverter.serialize([label]))
    print(f"✅ Serialized to {len(ndjson_output)} NDJSON objects")
    
    # Display serialized output
    print("\n📋 Serialized NDJSON Output:")
    for i, obj in enumerate(ndjson_output, 1):
        print(f"  {i}. {json.dumps(obj, indent=2)}")
    
    # Basic structural validation
    print("\n🔍 Performing structural validation...")
    
    # Check we have the right number of root classifications
    if len(ndjson_output) != 3:
        print(f"❌ FAILURE: Expected 3 root classifications, got {len(ndjson_output)}")
        return False
    
    # Check we have the expected classification names
    names = [obj["name"] for obj in ndjson_output]
    expected_names = ["text_class", "radio_class", "checklist_class"]
    
    for expected_name in expected_names:
        if expected_name not in names:
            print(f"❌ FAILURE: Missing expected classification: {expected_name}")
            return False
    
    print("✅ SUCCESS: Found all expected root classifications")
    
    # Check for nested structure in text_class
    text_class = next((obj for obj in ndjson_output if obj["name"] == "text_class"), None)
    if not text_class:
        print("❌ FAILURE: Could not find text_class")
        return False
    
    # Check that text_class has nested classifications
    has_nested = False
    for answer in text_class["answer"]:
        if "classifications" in answer:
            has_nested = True
            break
    
    if not has_nested:
        print("❌ FAILURE: text_class should have nested classifications")
        return False
    
    print("✅ SUCCESS: Found nested classifications in text_class")
    
    # Check that radio_class has nested structure
    radio_class = next((obj for obj in ndjson_output if obj["name"] == "radio_class"), None)
    if radio_class:
        has_radio_nested = False
        for answer in radio_class["answer"]:
            if "classifications" in answer:
                has_radio_nested = True
                break
        
        if has_radio_nested:
            print("✅ SUCCESS: Found nested classifications in radio_class")
        else:
            print("⚠️  WARNING: radio_class has no nested classifications (may be expected)")
    
    # Validate specific values
    print("\n🔍 Validating specific values...")
    
    # Check text_class values
    text_values = []
    for answer in text_class["answer"]:
        if "value" in answer:
            text_values.append(answer["value"])
    
    expected_text_values = ["A", "text_class value", "C", "D"]
    for expected_value in expected_text_values:
        if expected_value not in text_values:
            print(f"❌ FAILURE: Missing expected text value: {expected_value}")
            return False
    
    print("✅ SUCCESS: All expected text values found")
    
    # Check radio_class values
    if radio_class:
        radio_names = []
        for answer in radio_class["answer"]:
            if "name" in answer:
                radio_names.append(answer["name"])
        
        expected_radio_names = ["first_radio_answer", "second_radio_answer"]
        for expected_name in expected_radio_names:
            if expected_name not in radio_names:
                print(f"❌ FAILURE: Missing expected radio value: {expected_name}")
                return False
        
        print("✅ SUCCESS: All expected radio values found")
    
    # Check checklist_class values
    checklist_class = next((obj for obj in ndjson_output if obj["name"] == "checklist_class"), None)
    if checklist_class:
        checklist_names = []
        for answer in checklist_class["answer"]:
            if "name" in answer:
                checklist_names.append(answer["name"])
        
        expected_checklist_names = ["first_checklist_option", "second_checklist_option", "third_checklist_option"]
        for expected_name in expected_checklist_names:
            if expected_name not in checklist_names:
                print(f"❌ FAILURE: Missing expected checklist value: {expected_name}")
                return False
        
        print("✅ SUCCESS: All expected checklist values found")
    
    # Check frame ranges
    print("\n🔍 Validating frame ranges...")
    
    # Check that frames are preserved correctly
    all_frames_found = True
    expected_frames = [
        (1000, 1100),  # A
        (1500, 2400),  # text_class value
        (2500, 2700),  # C
        (2900, 2999),  # D
    ]
    
    for start, end in expected_frames:
        frame_found = False
        for answer in text_class["answer"]:
            if "frames" in answer:
                for frame in answer["frames"]:
                    if frame["start"] == start and frame["end"] == end:
                        frame_found = True
                        break
            if frame_found:
                break
        
        if not frame_found:
            print(f"❌ FAILURE: Missing expected frame range: {start}-{end}")
            all_frames_found = False
    
    if all_frames_found:
        print("✅ SUCCESS: All expected frame ranges found")
    else:
        return False
    
    print("\n🎉 SUCCESS: v3 class-based serialization is working correctly with all values!")
    return True


def main():
    """Main test function"""
    success = test_v3_serialization()
    
    if success:
        print("\n🎉 All tests passed! v3 class-based serialization is working correctly.")
    else:
        print("\n💥 Tests failed! There's a mismatch in the serialization.")
        sys.exit(1)


if __name__ == "__main__":
    main()

Note

Introduces temporal audio classification annotations with millisecond frames, NDJSON serialization, tests, and example updates.

Core types:
- Add AudioClassificationAnnotation with start_frame/end_frame; export in annotation_types.__init__.
- Extend ClassificationAnswer and ClassificationAnnotation to optionally carry start_frame/end_frame.
- Update Label.frame_annotations() to include AudioClassificationAnnotation frames.
Serialization:
- Add generic temporal builder in data/serialization/ndjson/temporal.py (grouping/nesting helpers).
- Wire audio path: create audio NDJSON via create_audio_ndjson_annotations; integrate in NDLabel.from_common and exclude audio from non-video path.
Tests:
- Add tests/data/serialization/ndjson/test_audio.py covering nested text/radio/checklist and frame ranges.
Examples:
- Enhance examples/annotation_import/audio.ipynb with temporal annotations (token-level), ontology INDEX-scope example, and MAL upload flow.
- Reorder/update links in examples/README.md tables (basics, exports, annotation import, integrations, model experiments, prediction upload).

Written by Cursor Bugbot for commit b0d5ee4. This will update automatically on new commits.

review-notebook-app · 2025-09-08T17:52:06Z

Check out this pull request on ReviewNB.

See visual diffs & provide feedback on Jupyter Notebooks.

cursor · 2025-09-30T18:44:49Z

libs/labelbox/src/labelbox/data/serialization/ndjson/temporal.py

+                if all_nested:
+                    entry["classifications"] = self._serialize_explicit_classifications(all_nested, root_frames)
+                entries.append(entry)
+            return entries[0] if len(entries) == 1 else {"options": entries, "frames": frames}


Bug: Single Option Checklist Loses Frame Data

The _create_answer_entry method's checklist handling returns a single option directly when only one is present. This loses the parent annotation's frame information and creates an inconsistent output structure compared to checklists with multiple options.
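
One way to avoid the special case is to emit the same wrapper regardless of how many options are present. The sketch below is an assumption about a possible shape, not the fix applied in the PR: the helper name `checklist_entry` is hypothetical, and whether the import endpoint accepts this structure is not confirmed in this thread.

```python
def checklist_entry(option_names, frames):
    """Build a checklist entry that always keeps the parent frames.

    Hypothetical helper: `frames` is a list of (start_ms, end_ms) tuples.
    """
    return {
        "options": [{"name": name} for name in option_names],
        "frames": [{"start": start, "end": end} for start, end in frames],
    }


single = checklist_entry(["first_checklist_option"], [(300, 800)])
multi = checklist_entry(
    ["second_checklist_option", "third_checklist_option"], [(2500, 2900)]
)
```

Because both calls return the same keys, downstream code does not need to branch on the option count, and the frame data survives in the single-option case.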

cursor · 2025-09-30T18:44:50Z

libs/labelbox/src/labelbox/data/serialization/ndjson/temporal.py

+                                pass
+                            else:
+                                # Both implicit - merge
+                                seen_values[value_key]["frames"].extend(opt_frames)


Bug: Frame Merging Fails in Radio Answer Serialization

The frame merging logic for radio answers in _serialize_explicit_classifications has issues when combining explicit and implicit frame definitions. It can silently drop implicit frames or replace existing implicit frames with new explicit ones, rather than merging all relevant frame ranges. This leads to data loss.
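
A merge that never drops ranges can be sketched as a union that keeps first-seen order and skips exact duplicates. The helper name `merge_frames` is hypothetical, and the sketch deliberately ignores the explicit/implicit distinction in the real method, so it shows only the lossless-merge idea.

```python
def merge_frames(existing, incoming):
    """Return existing + incoming frame dicts without exact duplicates.

    Hypothetical helper; neither input list is mutated.
    """
    seen = {(f["start"], f["end"]) for f in existing}
    merged = list(existing)
    for frame in incoming:
        key = (frame["start"], frame["end"])
        if key not in seen:
            seen.add(key)
            merged.append(frame)
    return merged


implicit = [{"start": 200, "end": 1500}]
explicit = [{"start": 2000, "end": 2500}, {"start": 200, "end": 1500}]
combined = merge_frames(implicit, explicit)
```

Here the repeated 200-1500 range appears once and the 2000-2500 range is appended rather than replacing what was already recorded.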

cursor · 2025-09-30T18:44:50Z

libs/labelbox/src/labelbox/data/serialization/ndjson/temporal.py

+            elif root_frames and root_frames[0] is not None and root_frames[1] is not None:
+                return [{"start": root_frames[0], "end": root_frames[1]}]
+            else:
+                return []


Bug: Frame Range Parsing Fails for Single Frames

The _get_nested_frames method incorrectly determines frame ranges. It requires both start_frame and end_frame to be explicit, missing valid single-frame annotations where end_frame is None. It also misinterprets an empty parent_frames list, leading to incorrect frame assignments for nested classifications.
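
The fallback order described in this report can be sketched as follows: use the annotation's own start if present, treat a missing end as a single frame, and inherit from the parent only when the annotation has no start at all. The function name `resolve_frames` and the representation of `parent_frames` as a list of `(start, end)` tuples are assumptions for illustration; the PR's method receives its arguments in a different form.

```python
def resolve_frames(start_frame, end_frame, parent_frames):
    """Pick frame ranges for a nested classification (hypothetical helper)."""
    if start_frame is not None:
        # Single-frame annotation: a missing end falls back to the start.
        end = start_frame if end_frame is None else end_frame
        return [{"start": start_frame, "end": end}]
    if parent_frames:
        # No explicit frames: inherit every range from the parent.
        return [{"start": s, "end": e} for s, e in parent_frames]
    # Empty or missing parent ranges yield no frames at all.
    return []


explicit_range = resolve_frames(1600, 2000, [(1500, 2400)])
single_frame = resolve_frames(1800, None, [(1500, 2400)])
inherited = resolve_frames(None, None, [(1500, 2400)])
nothing = resolve_frames(None, None, [])
```

Checking `start_frame is not None` rather than truthiness also keeps a start of 0 ms valid.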

rishisurana-labelbox added 2 commits September 3, 2025 14:55

chore: PoC + ipynb

e4fd630

chore: use ms instead of s in sdk interface

dbcc7bf

rishisurana-labelbox requested a review from a team as a code owner September 8, 2025 17:52

rishisurana-labelbox requested review from Jmsa, lgluszek, KeshavSahoo, cyrusj89, ramy1951, Tim-Kerr and dsinha244 September 8, 2025 17:52

rishisurana-labelbox temporarily deployed to Test-PyPI September 8, 2025 17:52 — with GitHub Actions Inactive

github-actions bot added 2 commits September 8, 2025 17:52

🎨 Cleaned

dbb592f

📝 README updated

ff298d4

rishisurana-labelbox requested review from kvilon and kjamrozy September 8, 2025 18:26

chore: it works for temporal text/radio/checklist classifications

16896fd

rishisurana-labelbox temporarily deployed to Test-PyPI September 11, 2025 19:13 — with GitHub Actions Inactive

chore: clean up and organize code

7a666cc

rishisurana-labelbox temporarily deployed to Test-PyPI September 11, 2025 20:46 — with GitHub Actions Inactive

chore: update tests fail and documentation update

ac58ad0

rishisurana-labelbox temporarily deployed to Test-PyPI September 11, 2025 21:22 — with GitHub Actions Inactive

github-actions bot and others added 3 commits September 11, 2025 21:23

🎨 Cleaned

67dd14a

📝 README updated

a1600e5

chore: improve imports

b4d2f42

rishisurana-labelbox temporarily deployed to Test-PyPI September 11, 2025 23:56 — with GitHub Actions Inactive

rishisurana-labelbox added 2 commits September 11, 2025 16:57

chore: restore py version

fadb14e

chore: restore py version

1e12596

rishisurana-labelbox temporarily deployed to Test-PyPI September 11, 2025 23:58 — with GitHub Actions Inactive

rishisurana-labelbox temporarily deployed to Test-PyPI September 12, 2025 17:00 — with GitHub Actions Inactive

rishisurana-labelbox temporarily deployed to Test-PyPI September 29, 2025 19:59 — with GitHub Actions Inactive

This comment was marked as outdated.

chore: update audio.ipynb to reflect breadth of use cases

1174ad8

rishisurana-labelbox force-pushed the rishi/ptdt-3807/temporal-audio-support-sdk branch from 2e8f828 to 1174ad8 September 29, 2025 20:48

rishisurana-labelbox temporarily deployed to Test-PyPI September 29, 2025 20:48 — with GitHub Actions Inactive

chore: cursor reported bug

2361ca3

rishisurana-labelbox force-pushed the rishi/ptdt-3807/temporal-audio-support-sdk branch from 8e06a7a to 2361ca3 September 29, 2025 20:51

rishisurana-labelbox temporarily deployed to Test-PyPI September 29, 2025 20:51 — with GitHub Actions Inactive

This comment was marked as outdated.

chore: extract generic temporal nested logic

59f0cd8

rishisurana-labelbox force-pushed the rishi/ptdt-3807/temporal-audio-support-sdk branch from 3e51273 to 59f0cd8 September 29, 2025 21:27

rishisurana-labelbox temporarily deployed to Test-PyPI September 29, 2025 21:27 — with GitHub Actions Inactive

This comment was marked as outdated.

chore: update temporal logic to be 1:1 with v3 script

b186359

rishisurana-labelbox force-pushed the rishi/ptdt-3807/temporal-audio-support-sdk branch from d186b38 to b186359 September 30, 2025 16:11

rishisurana-labelbox temporarily deployed to Test-PyPI September 30, 2025 16:11 — with GitHub Actions Inactive

This comment was marked as outdated.

chore: simplifiy drastically

e63b306

rishisurana-labelbox force-pushed the rishi/ptdt-3807/temporal-audio-support-sdk branch from f0a0723 to e63b306 September 30, 2025 17:22

rishisurana-labelbox temporarily deployed to Test-PyPI September 30, 2025 17:22 — with GitHub Actions Inactive

This comment was marked as outdated.

chore: works perfectly

6b54e26

rishisurana-labelbox force-pushed the rishi/ptdt-3807/temporal-audio-support-sdk branch from 0683dfd to 6b54e26 September 30, 2025 18:24

rishisurana-labelbox temporarily deployed to Test-PyPI September 30, 2025 18:24 — with GitHub Actions Inactive

github-actions bot and others added 3 commits September 30, 2025 18:25

🎨 Cleaned

ccad765

📝 README updated

735bb09

chore: update audio.ipynb

db3fb5e

rishisurana-labelbox temporarily deployed to Test-PyPI September 30, 2025 18:36 — with GitHub Actions Inactive

🎨 Cleaned

b0d5ee4

cursor bot reviewed Sep 30, 2025

review-notebook-app bot commented Sep 8, 2025

cursor bot Sep 30, 2025

Bug: Single Option Checklist Loses Frame Data

cursor bot Sep 30, 2025

Bug: Frame Merging Fails in Radio Answer Serialization

cursor bot Sep 30, 2025

Bug: Frame Range Parsing Fails for Single Frames
